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Abstract 

Latent variable models represent a useful tool for the analysis of complex data when 
the constructs of interest are not observable. A problem related to these models is that 
the integrals involved in the likelihood function cannot be solved analytically. We propose a 
computational approach, referred to as Dimension Reduction Method (DRM), that consists of 
a dimension reduction of the multidimensional integral that makes the computation feasible 
in situations in which the quadrature based methods are not applicable. We discuss the 
advantages of DRM compared with other existing approximation procedures in terms of both 
computational feasibility of the method and asymptotic properties of the resulting estimators. 
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1 Introduction 


Latent variable models are commonly used in many research fields where it is of interest to study 
constructs that are not observable but can be indirectly measured by indicators related to them. 
Typical examples can be found in sociology where questionnaires are utilized to measure attitudes 
and opinions, in educational testing where the ability of students is evaluated using exams, and in 
medicine where quality of life is assessed by function scores (e.g., been able to walk independently). 


A general framework that includes a large variety of latent variable mode 


Generalized Linear Latent Variable Models (GLLVMs) (IMoustaki and Knott 


s is represented by the 


2000 


Skrondal and Rabe-Hesket 


2004 ). The purpose of GLLVMs is to describe the relationship between a set of p responses or items, 


denoted for each individual by the p-dimensional vector y i,l = 1,... ,n, and a set of q latent vari¬ 
ables or/and random effects, denoted by b;, that are fewer in number than the observed variables 
(q < p). The latent variables are supposed to account for the dependencies among the response 
variables, while random effects are usually introduced in the model to account for some unobserved 
heterogeneity. Typically, in these models, the assumption of conditional or local independence is 
made, which states that the observed response variables are independent given the latent variables. 
The probability associated to the individual response pattern is expressed as follows 


f(yi) = [ g(yi I bj)/i(bi)dbj, (l) 

where h(bi) is usually referred to as the structural part of the model, generally assumed to be 
normal with null mean and covariance/correlation matrix equal to T', and g(yi | b/) is usually 
referred to as the measurement part of the model. The latter is taken from the exponential family. 
For the specific case of binary items treated in this study, assuming a canonical link function, it 
results 

9 {yi I b z ) = exp {y frji - 1 T log[l + exp (77,)] } , (2) 

where rp represents the systematic component of the model defined as 


rp = oc 0 + a T h h 


with cco being a p-dimensional vector of item specific intercepts and a is a structure matrix that 
contains unknown factor loadings if associated to latent variables, and fixed coefficients if associated 
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to random effects. The measurement model can be easily extended to different kinds of observed 
variables. 


GLLVMs can be estimated using eit her the Maximum Li 


EM or the Newton-Rap hson algorithms (IMoustaki and Knott 


the Bayesian paradigm ( Dunso n. 


2000 


telihood (ML) met 


2000 ; 


Huber et al 


rod, _t 


irough the 


2004J), or under 


2003). Here, we focus on the ML estimation method. In 


this case, the maximization procedure requires the latent variables and/or random effects to be 
integrated out from the likelihood function, but analytical solutions do not exist. Different remedies 
have been proposed in the literature in order to overcome this problem. Numerical quadrature based 
methods represent a widespread solution to approximate the integrals. At this regard, the Adaptive 
Gauss Hermite (A GH) quadrature approximation is the “gold” standar d, especially in presence of 


dichotomous data (Rabe-H es keth et al 


2005 


Schilling and Boo 




2005 ). However, it presents two 


drawbacks. The first one is that it is very computationally demanding because of the need to find 
the mode of the in tegr and for each subject and at each iteration of the optimization algorithm. In 


this regard, 


Rizopoulos (2012) proposed a pseudoversion of AGH that is faster than the classical 


AGH since the quadrature locations are updated just at the first iteration of the algorithm. The 
second drawback is that AGH becomes computationally unfeasible in presence of a large number 
of latent variables. 


An alternative way to face the problem is given by the Laplace approximation (IDe Bru iinl. 


1981 ) 


that can be viewed as a particul 

1994; 

Pinhciro and Bates, 

1995) 


ar case of AGH when one quadrature point is used (ILi u and Pier cel . 


curse of dimensionality since it does not require to solve any integral. Its computational complexity 
is only related to the computation of the mode of the integrand with respect to t he latent vari¬ 
ables. The standard Laplace approximation has been applied to GLLVMs by 


Huber et al 


( 2001 ) 


who investigated the properties of the resulting estimators when the direct maximization is used. 


In the Bayesian framework, 


Rue et al. 


(120091) combined Laplace approximation with numerical 


integration to provide a fast and accurate method, the so called Integrated Nested Laplace Ap¬ 
proximation (INLA), to approximate the predictive density of the latent variables/random effects. 
This approach is particularly effective when the inverse covariance matrix of the random effects is 
sparse and the number of parameters is small. 

The simplicity of the standard Laplace approximation has a cost related to the fact that the 
order of the approximation error is 0(p -1 ), and it cannot be improved unless the number of 
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items increases. Moreover, Joe ( 20081 ) investigated the performance of the method for different 


discrete response mixed models, and he found that it becomes less adequate as the degree of 
discreteness increases. Several authors developed higher order versions of the method with the 
aim of improving the approximation error. A first e xten sion of the standard Laplace, called fully 
exponential Laplace approxima t ion, w as proposed bv lTiernev et al.l (119891 ) in the Bayesian context, 


extended by 


Rizop oulps et al 


(2009 


p joint models for time event and longitudinal data, and 


applied by iBianconcini and Cagnonel (2012) to GLLVMs. The advantage of this method is that 
it improves the approximation error to 0(p~ 2 ), but with the same computational complexity as 
the standard Laplace. The good behavior of higher (than two) o rder Laplace approximati o ns ha s 


bee n highlighted in several papers, such as IShun and McCullaghl (119951) . 


Raudenbush et ah 


and 


Evangelou et al. 


( 120001 ) . 


(1201 ll) . However, the implementation of higher order Laplace approximations 


requires to compute a large number of partial derivatives of the integrand, strictly dependent on 
the model considered. 

In this paper, we propose a computational approach to approximate the integrals involved 
in the likelihood of GLLVMs. The proposal, that we refer to as Dimension Reduction Method 
(DRM), is based on the High Dimensional Model Representation (HDMR) of continuous real 


functions discussed by 


Rabitz and A 


considered here, called Cut-HDMR dLi et al. 


is l (11999 ). HDMR admits different specifications and the one 


20011), consists of representing a target function as 


a sum of not overlapping components tha 


can be proved to be the terms of the Taylor series 
expansion of the function ( Xu and Rahmanl . 2004). The application of Cut-HDMR representation 


truncated at s (s < q) terms of the integrand in eq. (JT]) implies a dimension reduction of the 
multidimensional integral that makes the computation feasible, also when the number of latent 
variables is large, and, in particular, in common situations in which the quadrature based methods 
are not applicable. The choice of s terms involved in the expansion determines the goodness 
of integral approximation, ft has to be noticed that, when s is set equal to zero, we get the 
standard Laplace approximation. In all the other cases, the DRM method produces a reduction 
in the integral dimension, since only 1, 2,..., s- dimensional integrals are involved. We discuss 
the advantages of DRM compared with the other existing approximation procedures, that is the 
Laplace approximation and the AGH quadrature, in terms of both computational feasibility of 
the method and asymptotic properties of the resulting estimators. As for the former, we show 
that, differently from the Laplace approximation methods, DRM does not require any derivative 


4 










































computation, and hence is very simple to implement. As for the latter, we give a formal proof that 
the DRM estimators based on s terms, where s < q, are asymptotically as accurate as the AGH 
estimators. 

The performance of the method is explored by means of a wide simulation study and on real 
data. In particular, we consider data coming from a longitudinal study, the Health and Retirement 
Study (HRS), conducted by the University of Michigan with the aim of investigating the health, 
social and economic implications of the aging of the American population. We analyze the dataset 
collected on a subsample of individuals who participated to a more extensive research on Aging, 
Demographics and Memory Study (ADAMS), which represents the first population-based study of 
dementia in the United States. We first estimate a classical confirmatory factor model on the data 
collected at the first time of observation. We apply the proposed methodology and compare the 
results with those derived by applying the standard Laplace and the AGH. Then, we highlight the 
main advantages of DRM with respect to the other approximation methods through a longitudinal 
analysis of the ADAMS data set observed at different occasions. 

The paper is organized as follows. In Section 2, we illustrate the integration problem when a 
maximum likelihood estimation of the model is used, and we discuss the dimension reduction 
approximation of the likelihood function. In Section 3, we derive the asymptotic properties of the 
DRM-based estimators and compare their behavior with that of the Laplace approximation and 
AGH-based estimators. The finite sample properties of the proposed estimators are analyzed in a 
simulation study presented in Section 4. A comparison with the existing approximation methods, 
that is the standard Laplace and AGH, is also carried out. The application to the HRS data is 
illustrated in Section 5 and Section 6 concludes the paper. 

2 Inference based on the Dimension Reduction Method 
(DRM) 

Maximum Likelihood (ML) estimates in the GLLVM framework are typically obtained by using 
either the EM algorithm or the direct maximization through Newton-type algorithms. The key 
component for applying both the procedures is the score vector of the observed data log-likelihood 
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function. For a random sample of size n, the latter is defined as 


R0) = 


( 3 ) 


1=1 


nq , y v n, . T 

—j log( 27 r) - - log | ^ 


E lo § 


1 = 1 


exp <J yj rj, - 1 T log[l + exp(r;d] - - bf^ % }• dh,, 


where 0 denotes the vector of model parameters, that is 0 = [a 0 , vec(a), vec^)] 7 . It is easily 
shown that the score vector equals 

S(0) = ® = E^iog/(yu0) = E i _s,(O;b I )Mb l |yj;0)db/ = E^b| y [s I (®;b,)], (4) 


do 


1=1 


1=1 


IR1 


1=1 


where S)(0;b/) denotes the complete-data score vector given by 
<91og/(yz, R; 6)/d6 = d [logg(y; | b^; 9) + log/i(b;)] /d9. In words, the observed data score vector 
is expressed as the expected value of the complete-data vector with resp ect to h(bi \ yg 9), that is 


the posterior distribution of the latent variables given the observations (Louis, 


1982). This implies 


that eq. ((4]) plays a double role. If the score equations are solved with respect to 9, with h(hi \ yi\9) 
fixed at the 0-value of the previous iteration, then this corresponds to the EM algorithm, whereas, 
if the score equations are solved with respect to 6 considering h(hi \ yi',9) also as a function of 
0, then this corresponds to a direct maximization of the observed data log-likelihood ^(0). As we 
shall discuss further, based on this appealing feature, the estimators derived by applying either of 
these two algorithms will share the same theoretical properties. 

Eq. g]) involves multidimensional integrals which cannot be solved analytically, except in 
presence of observed normal variables. An approximation of these integrals is needed, on which 
the bias and variance of resulting estimators will depend. Several numerical techniques have been 
proposed, but they produce unreliable solutions or become computational unfeasible as the number 
of latent variables inc reases. In this paper, we prop ose a generalized Dimen sion Reduction Method 


(DRM) developed by Rahman an d Xu (12004 ) and Xu and Rahma n (12004 1. As illustrated in the 


following, this method provides more accurate estimates than the Laplace approximation and, at 
the same time, it avoids the main computational limitations of the adaptive quadrature. 
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2.1 The generalized DRM for multidimensional integration 


The dimension reduction method (R ah man and Xu, 


2004; 


Xu and Rahm a n . 


2004) is applied to 


approximate the log-likelihood function ((3j) . It is based on a reduction of the dimensionality of 
the integral through an additive decomposition of the ^-dimensional integrand into at most s- 


dimensional terms, where s < q. This decomposition is know n as Cu t - Hig 


l Dimensional Model 


Rabitz and Alisl ( 11 999) for a more detailed 


Representation (HDMR), and we refer the reader to 
description of the approach. 

For unbounded integration domains, the method requires the integrals to be expressed in expected 
value form as follows 

E hl [v(b*)} = [ v(b *)/n(b*)db*, (5) 


where z/(b*) is a continuous, differentiable, real-valued function of mutually independent variables 
b* = [&*, • • • ,b*] T , and h x is an appropriate density function. In order to rewrite the integral ([T]) 
as an expected value of the form (l5l) , we follow an approach similar to the one used to derive the 
Laplace approximation. In this regard, we perform the following transformation of the integrand 


f(yi) 


exp[L(b;)]<ib z , / = l,...,n, 


( 6 ) 


being L(b;) = log g(y; | b;) + log/r(b;). Then, we consider the Taylor expansion of L(b*) around 
its mode b t = arg max b;eK ,j L(b/). That is, 


Lfa) = L( b z ) + i(b ; - bi) T L"(bi)(bi - bi) + ^(b,), 


(7) 


where v\{b /) includes all the terms of the expansion of order higher than two. Substituting eq. (IT|) 
into eq. (EJ), the integral results 



p[L(bi))clhi 


(2n) q/2 | S; | 1/2 exp[L(bi)] 


exp[z/i(bi)]0(b ; ; b ; , S;)db;, 
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where 0(bb;, S;) represents a multivariate normal density function whose mean vector is given by 
the mode b/ and its covariance matrix is minus the inverse of the Hessian matrix of L(h[) evaluated 
at its mode, that is Yq = — L"(hi). Eq. (J6J) can be rewritten as 

f(yi) = (27 t )^ /2 | £, | 1/2 exp[L(bj)]^ [exp(i/i(bj))]. (8) 


The DRM is then applied to approximate the expected value that appears in eq. (jHJ), after the 
following change of variables 


exp[zq(Qb* + b;)]0(b*;O,/)db z * 


g(yi 1 

Qb? + b,)/i(Cib? + b,) 

exp 

'l( b,)' 

exp [-Ibf h*] 


0(b?;O,/)db?, 


(9) 


where C/ is obtained from the Cholesky decomposition of the covariance matrix £;, that is £; = 
QCf, and the standardized variables b* are derived as bj* = C^ 1 (b ; — bj). 

Based on the Cut-HDMR, the function exp[zq(Czb* + b/)] in (J9J) is approximated by the sum of s 
orthogonal terms as follows 


exp[iq(C z b* + b;)] 



Y exp^RQb^ +bj)] 


( 10 ) 


where b^ k . = [0,..., 0, b kl , 0,...,..., 0, b* k ., 0,..., 0] T is a g-dimensional vector characterized 
by i free independent variables, b* k ., and (q — i ) values hxed equal to zero. Hence, the 

multidimensional integral dehned in expression (J9]) can be approximated as the sum of univariate, 
bivariate up to s-dimensional integrals as 


exp [vi{Cih* + b/)]0(b*; 0, 1)db* 
* (-1) ; 


/ , \ 


( 

, \ 



q- i 

+£(- i )-‘ 

q — i — 

1 

H / exp 

-L( b,)' 

v s ) 

1=1 

^ s-i 

) 

k!<...<ki 



( 11 ) 

-L( h i) 9(yi\C l b* klt '" tk . + b l ) 


HCib* kl: k . + b,) exp 


-b 


*T i_ * 


<K b .fcjcfbfci. 













where each i-dimensional integral, for i — 1,... ,s, can be easily computed using classical Gauss- 
Hermite (GH) quadratures. The approximated marginal density function results 


/( Yi) = | % | 1/2 e 1 


11/2 e L(b t ) 

(-i r 

f i ^ 
q -1 

s 

( 

q — i — 

A 



v s J 

2—1 

^ s - i 

/ 


( 12 ) 


Y Y e L(hl)7Tl/2 9(yi I b tlr .., w )h(b 




where ^2 ti t . = Ylt?=i ■ ■ ■ S” 9 =iG — 1,..., s, being n q the number of quadrature points se¬ 
lected for each latent variable, b= -\/2Q(0, b tl , 0,..., 0, b tj , 0,..., 0, b ti , 0) T + b ; , and = 

exp \b+ ], where b tj and vo tj ,j = 1 are the classical GH nodes and weights, respectively 


flStroud and Sechrest 


1966 ). 


An important feature of the Cut-HDMR flTTH) is that e a ch te rm in the summation has a clear 


mathematical meaning (Li et al 


2001 


Xu and Rahman, 


20041), which is especially evident if 


exp[z/!(Qb z * + bj)] can be expanded as a convergent Taylor series around 0. Indeed, it is easy 
to prove that the first term, obtained for i = 0, is the constant term of the Taylor series, the 
second term, derived for i — 1 , is the sum of all the Taylor series terms which only contain the 
variable b* k , k = 1 while, when i = 2, the term is the sum of all the Taylor series terms 

which only contain variables b* ki , k\ = 1 ,..., q, and b * k2 , k 2 = 1 ,..., q, and so on. In particular, 
Xu and Rahman (2004]) proved the following result: 


Theorem 1 For any q-dimensional function ^(b*), if it admits an s-variate Cut-HDMR of the 
form m, then this approximation, denoted by u s , consists of all terms of the Taylor series of 
z/(b*) that have less than or equal to s variables, i.e. 


where 


S 



w=0 
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tin 


E E 


()■>' ' i2 . 


n, *<*<...<*. Km - - - W og-eC • • • »S? 


Off •••»£■. 


b*=0 


It has to be noticed that the approximated Cut-HDMR expansion provides a better approxi¬ 
mation and convergent solution of exp[z/ 1 (C;b i * + b/)] than any truncated Taylor series expansion 
that contains a finite number of terms, generally of first and second order. Indeed, from eq. m3, 
it can be noticed that, when s — 0, the quantity among squared brackets reduces to one, such that 
the DRM approximated function results 


Kyi) = (2ir) 5/2 I £| | 1/2 e 1 ^', 


(13) 


that is the Laplace approximation of the integral ([Q). On the other hand, when the integral is 
g-dimensional, that is no dimensional reduction is performed, the approximation results 


Kyi) = 22 | % i 1 / 2 Y g(yi I b 


1 15 • • • 


)h(b* t 


w. 


. w 


tn ^ 


(14) 


t\tq 

where ^ = {b* tlP ..., b* tq l ) 7 = V2Ci(b h ,..., b tq ) T + b t and w* tj = w tj exp[6 t 2 .] are the adaptive 
Gauss-Hermite nodes and weights, respectively. In other words, the dimension reduction method 
provides an approximation of the marginal density function /(y/) that is more accurate than the one 
derived by applying the classical Laplace approximation, due to the inclusion of higher (than two) 
order terms in the Taylor series expansion of L(hi). The adaptive Gauss-Hermite approximation 
(fT4l) is obtained when no reduction is performed, that is when the integral dimension is equal to 
the number of latent factors. As shown in the subsequent sections, the selection of a value of s 
less than q in the DRM approximation avoids the main computational limitations of the adaptive 
quadrature, preserving, asymptotically, the same accuracy of the estimates. 

Based on (1121) . the approximation of the log-likelihood function becomes 
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m = ^log/(y, ; e) 


(15) 


1=1 

nq_] 


n 

2 


n 

i Eii + lo s 

(-i r 

f , \ 
q- 1 

s 

< i \ 

q — i — 1 

i=i 


v s ) 

i—1 



ni/2 s(yi I 

<C...<CAjj £i 5 ...j£j 


^ 


In this study, the maximization of expression (fl^D is performed using a quasi-Newton algorithm in 
which the gradient and the Hessian matrix are obtained using numerical derivatives. 


3 Statistical properties of the DRM-based estimators 

Motivated by the real data application, we derive the properties of the DRM-based estimators in 
GLLVM for binary data, where q latent factors hi account for the associations among the observed 
variables. 

To investigate the properties of the DRM-based maximum likelihood estimators 6, the asymptotic 
behavior of the approximation (fT5l) is analyzed by taking into account for its equivalent Taylor 


series expansion provic 


(19 94 ) and 


Bianconcini 


ed in Theorem CD Our arguments are similar to those of 


Liu and Pierce 


(120141 ) who analyzed the consistency of the adaptive Gauss-Hermite based 


estimators in mixed models and GLLVM, respectively. A detailed derivation of the error rate of 
the approximation is given in Appendix A. 

Proposition 1 (Consistency) Let 6 q E O denote the true parameter value, then, under suitable 
regularity conditions, 


(0 - 0 O ) = O p 


max n 


r 1/2 ,p-W +1 ] 


(16) 


Thus, 6 is consistent as long as both n and p grow to oo. A formal proof of Proposition |T| is given 
in Appendix B. The n -1 / 2 term comes from the standard asymptotic theory, whereas the p~[^ +1 ] 
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term derives from the DRM approximation. For n q > 3, the DRM-based estimators are more 
accurate than the classical Laplace ones, that are of order 0{jp~ l ). In particular, the DRM-based 
estimators share the same accuracy of the AGH ones, but the dimension reduction method avoids 


the main comp utational limi t ations of t 
As discussed by 


Huber et ah 


(2009]) and 


re latter. 


Bianconc inil (2014) for the Laplace-based and AGH-based 


estimators, respectively, when numerical techniques are used, such as the DRM, the corresponding 
estimators are not maximum likelihood estimators because of the approximation, but belong to 
the class of M-estimators. Hence, 

Proposition 2 (Asymptotic normality) If 0 o is an interior point of the parameter space 0 


and H(Qq) = —E 


a 2 i(flo) 

dead 1 


is nonsingular, the DRM based estimators are asymptotically normal, 


i.e. 


- o„)-> d mvn(o, H(e 0 )- 1 u(B a )[H(e 0 )- 1 } T ) 


(17) 


with U(0 q) = E 


d£( 0 o) de(Oo) - 

80 do 


being I(9 0 ) given in eg. 1131) . 


4 Finite sample results 

The performance of the proposed method is evaluated in finite samples by means of a Monte Carlo 
simulation study for the GLLVM in presence of binary data. This is a particular case of the model 
described in Section 1, where we consider b/ = (z u ,..., z q i) T as a vector of q factors that account 
for the associations among the observed variables, and we assume to be a correlation matrix for 
identifiability reasons explained below. 

We consider two different scenarios corresponding to models with two and four correlated latent 
variables, respectively. The choice of a reduced number of latent variables in the first scenario allows 
us to compare the performance of DRM with other approximation techniques without incurring 
in computational problems that, as known, can affect the quadrature-based methods when the 
dimension of integrals is high. In both scenarios, the sample size is set to 500 individuals. 100 
replications are considered per each condition of the study. 
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In the first scenario, the aim is to compare the performance of the DRM method with s — 1, 
that means we approximate the two-dimensional integral as a sum of univariate integrals, with the 
classical Laplace approximation, that corresponds to s = 0, and with the AGH quadrature-based 
method, that occurs when the approximated integral is of the same dimension as the number of 
latent variables, in this case two. For both DRM and AGH, an increasing number of quadrature 
points is considered, that is n q equal to 5, 7, and 11. It has to be noticed that for this particular 
model AGH is feasible for all the selected number of quadrature points. 

The true values of the model parameters have been selected so that at least four ( q 2 ) constraints 


are imposed to obtain model identification flJoreskoa. 


19691 ). In particular, two constraints are 


obtained from the correlation matrix T, being the main diagonal elements equal to one. Fixing 
the variances of the latent variables equal to the unity also ensures the identification of their 
scale. The free correlation parameter ^ 12 is set equal to 0.469. The remaining constraints are 
obtained by assuming a simple structure for the matrix of the loadings, that is by partitioning the 
manifest variables into two non-overlapping groups, each indicative of a different latent variable. 
The true values of the free loadings are generated from a log-normal distribution and are equal to 
an = 2.697, «i 2 = 0.933, a 2 3 = 1.232, and a 2 4 = 1.634. Starting values have been randomly chosen 


in a suitable range for the parameters. 

Table [Q reports the estimates of the two-factor model parameters derived by applying the 
Laplace approximation, DRM and AGH, the latter both based on five quadrature points (n q = 5), 
and denoted by DRM5 and AGH5, respectively. The results are illustrated in terms of Relative 
bias (Rbias) and Standard deviations (S.d.) of the Monte Carlo estimates, as well as in terms 
of computational performance of the algorithms. As for the latter, we can observe that the per¬ 
centage of samples for which the algorithm reaches convergence properly (% cv ) increases with s, 
varying from 91% under the Laplace approximation to 100% for AGH5. The average log-likelihood 
values over the one hundred replications increase with s, whereas the average number of function 
evaluations (nr feval) and the average computational time in seconds (avg time ) are the lowest 
when the Laplace approximation is applied and increase with s. This is clearly due to the in- 
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Table 1: Monte Carlo results for the two-factor model under Laplace, DRM and AGH with n q = 5, 
n = 500, 100 replications 




Laplace 



DRM5 



AGH5 




s=0 



S=1 





True 

Mean 

RBias 

S.d. 

Mean 

RBias 

S.d. 

Mean 

RBias 

S.d. 

an = 2.697 

1.168 

-0.567 

0.165 

1.908 

-0.293 

0.303 

1.923 

-0.287 

0.471 

ol\2 = 0.933 

1.027 

0.101 

0.195 

1.110 

0.190 

0.231 

1.075 

0.153 

0.242 

a23 = 1.232 

1.161 

-0.058 

0.221 

1.321 

0.072 

0.250 

1.342 

0.089 

0.384 

a24 = 1.634 

1.121 

-0.314 

0.189 

1.454 

-0.110 

0.288 

1.661 

0.023 

0.478 

ipi2 = 0.469 

0.682 

0.453 

0.138 

0.455 

-0.030 

0.099 

0.519 

0.105 

0.049 

Avlog-lik 


-1225.87 



-1216.95 



-1213.95 


%cv 


91 



93 



100 


Nrfeval 


16.57 



24.37 



31.17 


Avtime (s) 


38.82 



66.78 



87.71 



creasing dimension of the integrals to be approximated, being unidimensional when the DRM5 is 
applied and bidimensional for the AGH5. In terms of accuracy of the estimates, we can observe 
that the relative bias of the estimates derived by applying the Laplace approximation is quite high 
for almost all the loadings and for the correlation estimate, whereas they generally improve for 
increasing values of s. As for the standard deviation of the estimates, considering AGH5 as the 
“gold” standard, we can observe that the Laplace approximation tends to strongly underestimate 
the variability of the loading estimates, whereas it overestimates the variability of the correlation 
estimate. A similar behavior is observed for the standard deviation of the estimates under DRM5, 
even if they are closer to the ones obtained with AGH5 

The behavior of the estimators obtained under the different approximations is highlighted in 
Figure [U where their estimated densities are illustrated. It is clearly evident the different perfor¬ 
mance of the methods particularly for «u, « 24 , and for The better performance of the most 
informative AGH5 compared to the other methods is particularly evident in the reduction of the 
bias. For the correlation parameter, the accuracy of Laplace is noticeably worse than the other 
methods. DRM5 seems to be a good compromise between the Laplace approximation and the 
AGH5 in terms of both accuracy of the estimates and computational burden of the algorithm. 
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In Table O the results of the two-factor model estimation based on the dimension reduction method 
and the AGH quadrature are compared when seven and eleven quadrature points are used in both 
the numerical techniques. DRM7 and AGH7 denote the two techniques when seven quadrature 
points are used (n q = 7), whereas DRM11 and AGH11 indicate the techniques when eleven nodes 
are considered (n q = 11). For both seven and eleven nodes, DRM appears to be superior to 
the AGH in terms of average number of function evaluations and computational time. As in the 
previous case, the standard deviations of the loading estimates obtained with DRM7 are smaller 
than those obtained with AGH7, and those obtained with DRM11 are smaller than those obtained 
with AGH11. For the correlation estimate DRM overestimates its standard deviation compared to 
AGH in both cases, n q = 7 and n q = 11. 

aq i ol 12 




«23 



«24 


Vh2 




Figure 1: Estimated densities of the loadings and the correlation estimates for the two-factor 
model under Laplace, DRM5 and AGH5. The dashed vertical line indicates the true value of the 
parameter. 
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Table 2: Monte Carlo results for the two-factor model under DRM and AGH with n q = 7,11, 
n = 500, 100 replications 




DRM7 



AGH7 



DRM 11 



AGH 11 




S=1 






S=1 





True 

Mean 

RBias 

S.d. 

Mean 

RBias 

S.d. 

Mean 

RBias 

S.d. 

Mean 

RBias 

S.d. 

an = 2.697 

2.100 

-0.222 

0.457 

2.214 

-0.179 

0.553 

2.525 

-0.064 

0.899 

2.995 

0.110 

1.187 

a 12 = 0.933 

1.088 

0.166 

0.275 

1.044 

0.119 

0.259 

1.024 

0.097 

0.256 

0.990 

0.061 

0.269 

a23 = 1.232 

1.288 

0.045 

0.313 

1.340 

0.087 

0.381 

1.308 

0.061 

0.259 

1.325 

0.075 

0.407 

a24 = 1.634 

1.432 

-0.124 

0.374 

1.781 

0.090 

0.573 

1.475 

-0.097 

0.362 

1.981 

0.213 

0.924 

012 — 0.469 

0.471 

0.002 

0.132 

0.500 

0.065 

0.105 

0.465 

-0.009 

0.150 

0.479 

0.021 

0.111 

Avlog-lik 


-1216.80 



-1213.50 



-1215.69 



-1213.16 


%cv 


99 



100 



99 



100 


Nrfeval 


34.13 



34.61 



37.75 



42.51 


Avtime (s) 


92.55 



121.44 



124.38 



192.32 



In the second scenario, we evaluate the performance of DRM for all possible values of s, that is 
zero, one, two, three. This allows to compare, also in this case, the results of DRM with Laplace 
(s = 0) and AGH. The latter is feasible only if we set the number of nodes for each dimension 
equal to five or seven. We consider the solution for n q = 5 since there are no relevant differences 
with n q = 7. Also in this scenario, we consider a simple structure for the matrix of the loadings 
that is characterized by two free parameters per each factor. In particular, the loading parameters 
are fixed to an = 2.697, a ±2 = 0.933, «23 = 1.659, «24=1.241, a 35 = 1.486, «36 = 1.156, 047 = 

0.756, a 48 = 0.884 and the correlation parameters are fixed to 0 i2 = 0.470,-013 = 0.534,0 14 = 

O.48O,0 2 3 = 0.405,024 = 0.440,034 = 0.571. 

In the second scenario, we evaluate the performance of DRM for all possible values of s, that is 
zero, one, two, three. This allows to compare, also in this case, the results of DRM with Laplace 
(s = 0) and AGH. The latter is feasible only if we set the number of nodes for each dimension 
equal to five or seven. We consider the solution for n q = 5 since there are no relevant differences 
with n q = 7. Also in this scenario, we consider a simple structure for the matrix of the loadings 
that is characterized by two free parameters per each factor. In particular, the loading parameters 
are fixed to an = 2.697, au = 0.933, a 2 3 = 1.659, a 24= l.241, a 35 = 1.486, «36 = 1.156, a 4 7 = 

0.756, a 4 s = 0.884 and the correlation parameters are fixed to 0 4 2 = 0.470,013 = 0.534,0 14 = 

0.480,023 = 0.405,02 4 = 0.440,0 34 = 0.571. In Figure [2]the box-plots of the Monte Carlo estimates 
are illustrated. 

The accuracy of the parameter estimates is in line with the findings of the previous scenario. 
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Figure 2: Box-plots of the parameter estimates for the four-factor model under Laplace (s = 0), 
DRM (for s — 1,2, and 3), and AGH (s = 4), for n q — 5, n — 500, 100 replications. 


Table 3: Computational performance of the algorithms based on the Laplace approximation, DRM 
and AGH based on five quadrature points for the four-factor model with n = 500,100 replications 



Laplace 


DRM5 


AGH5 


s=0 

S=1 

s=2 

s—3 


Avlog-lik 

-2404.76 

-2398.89 

-2394.77 

-2394.16 

-2393.83 

%cv 

26 

84 

98 

100 

100 

Nrfeval 

26.27 

26.79 

28.32 

30.36 

36.98 

Avtime (s) 

240.75 

290.12 

438.43 

902.06 

909.13 


17 






























































































































The bias of the estimates reduces as s increases, whereas the variability of the loadings is underesti¬ 
mated. In terms of correlation estimates, the bias decreases as s increases, with a similar behavior 
for the variability of the estimates when s is equal to two and three and for the AGH case. It is 
worth noting that, in this model, the bias and variability of the estimates obtained with DRM5 
with s — 2 and with s = 3 are very close, and they are coincident for DRM5 with s = 3 and AGH5 
for the most of the parameter estimates. Hence, the information that we gain passing from s — 2 
to s = 3 is small and passing from s = 3 to the AGH approximation is almost irrelevant. The 
computational performance of the algorithm based on the Laplace approximation, DRM with s = 2 
and 3, and AGH is analyzed in Table [3j We can observe that the Laplace method performs very 
poorly in terms of samples in which the algorithm converges properly (only 26%). This percentage 
rises to 84% under DRM5 with s = 1, to 98% under DRM5 with s = 2 and it is 100% under DRM5 
with s = 3 and under AGH5. But if we look at the computational time we can observe that, on 
average, when s = 2, the algorithm requires half of the time to converge compared with s = 3 and 
AGH5 that are very similar also in terms of computational time. 

5 HRS data analysis 

Every two years the Institute for Social Research at the University of Michigan conducts a national 
representative Health and Retirement Study (HRS) funded by the National Institute on Aging. 
The study is designed with the aim to investigate the health, social and economic implications of 
the aging of the American population (Juster and Suzman, 1995). Between August and December 
2001, a random sample of 856 subjects from all regions of the country aged 70 or older was 
selected from the total sample of approximately 22000 HRS individuals in order to participate 
to a more extensive study on Aging, Demographics And Memory Study (ADAMS). The latter 
represents the first population-based study of dementia in the United States (the data are available 
on the website http://www.rand.org). The participants received a thorough in-home clinical and 
neuropsychological assessment that allowed researchers to estimate the prevalence, predictors, and 
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outcomes of dementia in the U.S. elderly population. Three waves of data were collected from 2001 
to 2008. 

The aim of our analysis is to evaluate the performance of the proposed DRM approach by first 
fitting a GLLVM for binary data to the ADAMS data available at the first time point and then by 
conducting a longitudinal analysis over the available time points of one of the dimensions detected 
in the first analysis. 


5.1 Factor analysis of the HRS data set 

The individuals selected for the ADAMS study were asked to answer to the items of the Section 
D of the HRS questionnaire devoted to the cognitive assessm ent. A detailed description of the 


measures of the cognitive functioning can be found in 


Ofstedal et al. 


( 20051 ). For our analysis, we 


selected a subset of these measures corresponding to the following binary items: 

1. Please tell me today’s date: year [MSE1] 

2. Please tell me today’s date: date [MSE2] 

3. Please tell me today’s date: day of week [MSE3] 

4. Please tell me today’s date: month [MSE4] 

5. Let’s try some subtraction of number: 100 minus 7 equals what? [SER71] 

6 . And 7 from that? [SER72] 

7. And 7 from that? [SER73] 

8 . And 7 from that? [SER74] 

9. And 7 from that? [SER75] 

10. Who is the President of the United States right now? [PRE] 


11. Who is the vice President? [VPRE] 
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12. Please try to count backward as quickly as you can from 20 to 11 without error [BW20] 

13. Please try to count backward as quickly as you can from 86 to 77 without error [BW86] 


For all of them, 1 means correct answer and 0 m eans 


confirmatory analysis on the HRS data ( McArdle et ah 


wrong answer. Previou s exploratory and 


2007|; 


M cA rdle. 


20101 ) highlighted that 


these items are measure of four latent factors (sub-scales) that are expression of the mental status 
of the interviewed people. In more detail, items from 1 to 4 are measures of the sub-scale Dates 
(DA), items 10 and 11 are measures of the sub-scale Names (NA), items 12 and 13 are measures of 
the sub-scale Backward Counting (BC) and items from 6 to 9 are measures of the sub-scale Serial 7’s 
Test (S7). Thus, we fitted a four-factor model as defined in (J2J) , assuming a simple structure for the 
matrix of loadings and, for the identihability reasons discussed before, considering as correlation 
matrix. We allowed s to vary from zero to three. The solution for s equal to 0 corresponds to 
the Laplace approximation and the solution when no DRM is performed, that is the integral is 
four-dimensional, is obtained by applying the AGH. Using five and seven quadrature points, for 
both DRM and AGH, we obtained very similar results, thus Table [4] shows the results only for 


n q = 5. 


Table [4] gives the estimated loading parameters with associated asymptotic standard errors 
obtained from (j 1 7jl . The magnitude of the loading estimates is close for all values of s, whereas, in 
general, the smallest standard errors are obtained with s = 3 and AGH. All the loadings have values 
greater than one and are significantly different from zero independently on the approximation used. 
Thus, the items are all relevant measures of the corresponding underlying factors. Table ED reports 
the estimated correlation parameters, the value of the log-likelihood and the computational time 
for the DRM with each value of s and for the AGH. It is worth noting that the values of the 
correlation parameters are very close and, in some cases, they are coincident at the second digit 
for all s. The smallest standard errors are obtained for s = 1 and s = 3, but, also in this case, 
we can observe that the significance of the estimates does not change with s. As expected, the 
log-likelihood increases with s, whereas the computational time for s = 3 and AGH is more than 
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twice than the one obtained with s = 2. Thus, we can conclude that DRM 5 with s = 2 seems to 
provide the best compromise in terms of accuracy of the estimates and computational performance. 


5.2 Longitudinal analysis of the HRS data set 


Multidimensional longitudinal data can be challenging to analyze since a large number of random 
effects/latent variables can be required to take into account for different sources of variability 
present in the data. In this section, we analyze the items SER71 SER72 SER73 SER7A SER75 
of the ADAMS data set observed at three different time points (2001-2003, 2002-2005, 2006-2008). 
The aim is to investigate how the working memory and mental processing task in which respondents 
counted backward from 100 by 7s changes over time. 

We applied the GLLVM for multidimensional longitudinal data proposed by 


and 


Cagnone et al. 


Dunsonl (2003) 


( 20091 ) to analyze p binary variables observed at T different time points, such 


that y i = (yit,V 2 t, ■ ■ ■ ,y P t ) is a pT- dimensional vector. Beyond the association between several 
items at the same time point, their dependence over time as well as the variability of the same 
item at different time points have to be considered. The latter is accounted by p item-specific 
random effects u/ = (uu ,..., u p i ) T , whereas the former is explained by means of T time-dependent 
latent factors z/ = (zu, • • •, Zti), such that the vector of latent variables b/ = (z/, ip) r is (p + T)- 
dimensional. The correlation of the latent variables Z{ over time is modeled through a first order 
non-stationary autoregressive process of parameter cj) and Var(zi) = a 2 . Hence, the covariance 

T 0 


matrix of b;, Sh, is a block matrix 


with = diagi = i : ____ p {a 2 .}. The elements of T are 


0 

the variances and covariances of the time-dependent latent variables z t , that is 7 t>t = Var(z t ) = 
(f) 2 ^- 1 )a\ +1 (t > 2 ) and 7 t ,v = Cov(z t , z t >) = (j) t+t '~ 2 al + I(t > 2 ) )T)l=o 4 > t '~ t+2k , where 


/(.) is the indicator function and t < t'. 

The measurement part of the model has the same specification given in eq. (l2]l . but the linear 
predictor is now characterized by a pT - dimensional vector cko of item- and time-specific intercepts, 
and a loading matrix ct of dimension (jpT x (p + T)) whose generic submatrix referring to the j-th 
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item is given by [ck,-It O [j, It]], being It the identity matrix of dimension T x T, and O [j, It] a 
null matrix of dimension T x (T + 1) whose j'-tli column is substituted by a T-dimensional vector 
of ones. 

It can be noticed that, in the longitudinal setting, the number of latent variables increases linearly 
with the number of observations. In this particular example, the number of latent variables and 
random effects is equal to eight (q — 8), being p = 5 and T = 3. It follows that AGH is unfeasible 
also using five quadrature points per each dimension, hence only DRM can be applied in this case. 
We consider a fully constrained model (equal thresholds and equal loadings over time) in order to 
obtain the same metric (origin and scale) of the latent variable over the three time points. The 
resulting strict measurement invariance allows comparisons of the factor across time. 

As for the choice of s, we started fitting the model with s = 1 and we increased its value until 
the mean of the absolute differences between parameter estimates (At’(A)) became sufficiently 
small (order of 10~ 3 ). 

In Table [6] we report the estimated model parameters under DRM with five quadrature points 
for s = 1,2, 3,4. It is interesting to notice that, for this particular model, apart from </>, all the 
parameter estimates change sensibly from s = 1 to s = 2 (Av(Ai2)=0.164). On the contrary, they 
are all very close for s = 3 and s = 4 (At>(A 34 )=0.007), indicating that the additional information 
included in the likelihood for s > 3 is almost irrelevant. From a computational point of view, there 
is a substantial difference in the time to convergence for increasing values of s. For example, the 
computational time for s = 4 is more than six times the computational time for s = 3. It is evident 
that in this case s = 3 can be considered as the reference solution. 

Looking at the results obtained when s equals 3 (Table [7]), we can notice that the variances of 
the random effects are all quite small and not significantly different from zero, indicating that 
there is no relevant heterogeneity of individuals on each item over time. On the other hand, the 
autoregressive parameter is very high and significant (</> = 0.99 with standard error equal to 0.15), 
indicating a highly persistent latent process over time. 
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Table 4: Estimated factor loadings with asymptotic standard errors in brackets for the fourth factor model, s=0,l,2,3,4, ADAMS 


data set 




Lapl 

ace 







DRM5 







AGH5 




s= 

0 



s- 

=1 



s- 

=2 



s=< 

3 







DA 

S 7 

NA 

BC 

DA 

S 7 

NA 

BC 

DA 

S 7 

NA 

BC 

DA 

S 7 

NA 

BC 

DA 

S7 

NA 

BC 

Smsei 

1.92 




2.10 




1.89 




1.95 




1.96 





(0.43) 

- 

- 

- 

(0.29) 

- 

- 

- 

(0.31) 

- 

- 

- 

(0.17) 

- 

- 

- 

(0.34) 

- 

- 

- 

S MSE2 

1.45 

- 

- 

- 

1.60 

- 

- 

- 

1.49 

- 

- 

- 

1.51 

- 

- 

- 

1.51 

- 

- 

- 


(0.52) 

- 

- 

- 

(0.42) 

- 

- 

- 

(0.51) 

- 

- 

- 

(0.24) 

- 

- 

- 

(0.39) 

- 

- 

- 

S MSE3 

1.84 

- 

- 

- 

2.03 

- 

- 

- 

1.81 

- 

- 

- 

1.86 

- 

- 

- 

1.87 

- 

- 

- 


(0.51) 

- 

- 

- 

(0.27) 

- 

- 

- 

(0.43) 

- 

- 

- 

(0.14) 

- 

- 

- 

(0.22) 

- 

- 

- 

S MSE4 

1.42 

- 

- 

- 

1.54 

- 

- 

- 

1.41 

- 

- 

- 

1.44 

- 

- 

- 

1.45 

- 

- 

- 


(0.45) 

- 

- 

- 

(0.26) 

- 

- 

- 

(0.35) 

- 

- 

- 

(0.16) 

- 

- 

- 

(0.17) 

- 

- 

- 

S SER71 

- 

1.76 

- 

- 

- 

1.82 

- 

- 

- 

1.77 

- 

- 

- 

1.77 

- 

- 

- 

1.78 

- 

- 


- 

(0.45) 

- 

- 

- 

(0.36) 

- 

- 

- 

(0.40) 

- 

- 

- 

(0.26) 

- 

- 

- 

(0.39) 

- 

- 

«SER72 

- 

1.60 

- 

- 

- 

1.69 

- 

- 

- 

1.73 

- 

- 

- 

1.74 

- 

- 

- 

1.74 

- 

- 


- 

(0.83) 

- 

- 

- 

(0.65) 

- 

- 

- 

(0.39) 

- 

- 

- 

(0.28) 

- 

- 

- 

(0.24) 

- 

- 

«SER73 

- 

1.18 

- 

- 

- 

1.26 

- 

- 

- 

1.30 

- 

- 

- 

1.30 

- 

- 

- 

1.28 

- 

- 


- 

(0.64) 

- 

- 

- 

(0.36) 

- 

- 

- 

(0.29) 

- 

- 

- 

(0.18) 

- 

- 

- 

(0.19) 

- 

- 

«SER74 

- 

2.23 

- 

- 

- 

2.28 

- 

- 

- 

2.29 

- 

- 

- 

2.30 

- 

- 

- 

2.31 

- 

- 


- 

(0.91) 

- 

- 

- 

(0.42) 

- 

- 

- 

(0.63) 

- 

- 

- 

(0.35) 

- 

- 

- 

(0.40) 

- 

- 

«SER75 

- 

1.71 

- 

- 

- 

1.74 

- 

- 

- 

1.84 

- 

- 

- 

1.84 

- 

- 

- 

1.84 

- 

- 


- 

(0.76) 

- 

- 

- 

(0.48) 

- 

- 

- 

(0.61) 

- 

- 

- 

(0.30) 

- 

- 

- 

(0.38) 

- 

- 

S PRE 

- 

- 

1.12 

- 

- 

- 

1.35 

- 

- 

- 

1.12 

- 

- 

- 

1.17 

- 

- 

- 

1.18 

- 


- 

- 

(0.91) 

- 

- 

- 

(0.27) 

- 

- 

- 

(0.41) 

- 

- 

- 

(0.20) 

- 

- 

- 

(0.23) 

- 

S VPRE 

- 

- 

1.39 

- 

- 

- 

1.38 

- 

- 

- 

1.43 

- 

- 

- 

1.45 

- 

- 

- 

1.45 

- 


- 

- 

(1.99) 

- 

- 

- 

(0.26) 

- 

- 

- 

(0.44) 

- 

- 

- 

(0.27) 

- 

- 

- 

(0.22) 

- 

S BW20 

- 

- 

- 

1.43 

- 

- 

- 

1.57 

- 

- 

- 

1.41 

- 

- 

- 

1.46 

- 

- 

- 

1.47 


- 

- 

- 

(0.37) 

- 

- 

- 

(0.23) 

- 

- 

- 

(0.59) 

- 

- 

- 

(0.23) 

- 

- 

- 

(0.22) 

S BW86 

- 

- 

- 

1.05 

- 

- 

- 

1.07 

- 

- 

- 

1.13 

- 

- 

- 

1.19 

- 

- 

- 

1.19 




" 

(0.21) 

- 

- 

- 

(0.28) 

- 


- 

(0.36) 

- 

- 

- 

(0.20) 

- 

- 

- 

(0.26) 



Tabic 5: Estimated correlation parameters with standard errors in brackets for the fourth factor 
model, s=0,l,2,3,4, ADAMS data set 



Laplace 


DRM5 


AGH 5 


s=0 

S=1 

s=2 

s=3 


012 

0.64 

0.64 

0.64 

0.64 

0.64 


(0.07) 

(0.02) 

(0.06) 

(0.01) 

(0.27) 

013 

0.91 

0.92 

0.92 

0.92 

0.92 


(0.49) 

(0.01) 

(0.01) 

(0.01) 

(0.04) 

023 

0.55 

0.51 

0.57 

0.55 

0.55 


(0.16) 

(0.01) 

(0.03) 

(0.02) 

(0.10) 

014 

0.90 

0.89 

0.89 

0.89 

0.89 


(0.31) 

(0.02) 

(0.02) 

(0.01) 

(0.07) 

■024 

0.85 

0.85 

0.83 

0.81 

0.81 


(0.04) 

(0.02) 

(0.08) 

(0.03) 

(0.09) 

034 

0.76 

0.77 

0.74 

0.74 

0.74 


(0.04) 

(0.03) 

(0.04) 

(0.03) 

(0.08) 

log-lik 

-1413.01 

-1410.90 

-1410.99 

-1409.60 

-1409.54 

time (s) 

217.83 

157.82 

238.80 

592.02 

552.10 


6 Conclusions 


One of the main problems in the estimation of GLLVMs is that integrals involved in the likelihood 
do not have an analytical solution due to the presence of latent variables. A “gold” standard 
approach is represented by the Adaptive Gauss-Hermite quadrature, that is known to provide 
quite accurate estimates. However, it becomes computational unfeasible in presence of a large 
number of factors. A typical case is represented by GLLVMs for longitudinal data where the 
number of latent variables/random effects increases proportionally with the number of items. 

In this paper, we have proposed a new computational approach that overcomes these limitations 
since it decomposes the (/-dimensional integral into the sum of 1 , 2 ,..., ,s-dimensional integrals, 
being s < q, that can be easily approximated using classical Gauss-Hermite quadrature techniques. 
We have shown that, even if DRM uses less information than AGH, the DRM estimators share 
the same asymptotic properties of the AGH ones. This is due to the fact that DRM is based on 
an expansion of the integrand, known as Cut-HDMR, that provides a better approximation than 
any truncated Taylor series expansion that contains a finite number of terms, generally of first 
and second order. In this regard, the DRM-based estimators are more accurate than the classical 
Laplace ones, that corresponds to the case of s equal to 0. On the other hand, we have shown that, 
in finite samples, DRM and AGH produce similar bias in the estimates, but the former tends to 
underestimate the Monte Carlo variance. These discrepancies are more evident when s = 0, and 
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Table 6: Estimated model parameters under DRM5 for the non-stationary model and absolute 
relative differences between consecutive values of s, ADAMS data set 



s = 1 

s = 2 

s = 3 

s = 4 

A 12 

A 23 

A 34 

Q 1 

1.000 

1.000 

1.000 

1.000 

- 

- 

- 

OL2 

0.495 

0.551 

0.585 

0.585 

0.114 

0.061 

0.001 

5 3 

0.455 

0.508 

0.531 

0.531 

0.116 

0.045 

0.001 

a 4 

0.533 

0.586 

0.619 

0.619 

0.100 

0.057 

0.001 

S 5 

0.612 

0.678 

0.720 

0.720 

0.108 

0.062 

0.001 

S> 

0.995 

0.993 

0.994 

0.994 

0.002 

0.001 

0.001 


7.053 

6.052 

5.715 

5.721 

0.142 

0.056 

0.001 

K 

0.560 

0.444 

0.573 

0.562 

0.206 

0.291 

0.020 

a 2 1 

0.974 

1.157 

1.370 

1.360 

0.187 

0.185 

0.008 


0.466 

0.589 

0.741 

0.734 

0.265 

0.257 

0.009 


0.404 

0.512 

0.699 

0.688 

0.268 

0.364 

0.016 


0.241 

0.313 

0.404 

0.399 

0.301 

0.290 

0.014 

Av( A) 

- 

- 

- 

- 

0.164 

0.152 

0.007 

log-lik 

-520.26 

-518.90 

-517.75 

-517.79 

- 

- 

- 

time (s) 

937.51 

1290.06 

6697.60 

44246.78 

- 

- 

- 


Table 7: Estimated model parameters under DRM5 with standard errors in brackets for the non- 
stationary model, DRM5, s=3, ADAMS data set 



a 


SER71 

1.00 

0.57 (2.61) 

SER72 

0.59 (0.30) 

1.37 (0.89) 

SER73 

0.53 (0.34) 

0.74 (0.89) 

SER74 

0.62 (0.41) 

0.70 (1.06) 

SER75 

0.72 (0.31) 

0.40 (1.23) 


reduce as s increases. 

The advantages of DRM with respect to AGH are particularly evident from a computational 

point of view. When evaluating eq. (P using the AGH quadrature rule with n q points in each di- 

( \ 


mension, the total number of function or response evaluations is n q q . In contrast, Xu=o 


n 


s—i 

Q 


function evaluations are required using the dimension reduction method. Figure [3] shows how the 
ratio of these two function evaluation numbers varies with respect to s for q equal to 8 and 

n q equal to 5, 7 and 11. A reduction of the computational effort is achieved when the ratio 

/ \ 


e: =c 


n S q ■y n% < 1. It can be noticed that the amount of reduction depends on both s 


\ S ~ l ) 

and n q . In particular, for the univariate dimension reduction method the ratio has a magnitude of 
order ICR 4 , ICE 5 , and ICE' when n q = 5, 7 and 11, respectively. In the case of the bivariate DRM, 
the magnitude of the ratio is of order ICE 3 , ICE 4 , and ICE 11 when n q = 5,7 and 11, respectively. 
Furthermore, even if not shown here, it is evident that the reduction is dramatically enhanced as 
q increases. 

A peculiar point of DRM technique is the choice of the number of terms involved in the com- 
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Figure 3: Computational load, q — 8 latent factors, nq = 5, 7,11 quadrature points. 

putation. At this regard, we follow the same rule of thumb as for the choice of the number of 
quadrature points, that is we increase s until estimates stabilize. We underline that once the value 
of s is selected, we can further improve the accuracy of the estimates by increasing the number of 
quadrature points. 

The main limitation of the proposed approach is its dependence on the likelihood function of 
the selected model. Even if the theoretical framework considered in this paper is general, the 
implementation of DRM on different models requires ad hoc solutions since at the moment no 
commercial software is available yet. In this respect, we implemented DRM for GLLVMs in R, and 
the code is available from the authors upon request. 
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